[warm-reboot][sad] enhance test to cover different port types and ope… #3853
…r state in sad scenarios Signed-off-by: Stepan Blyschak <[email protected]>
Signed-off-by: Stepan Blyschak <[email protected]>
@stepanblyschak Can you please resolve the conflict and update the PR? Can we check the port status, selecting ports based on either the port status before rebooting or the minigraph information?
@sujinmkang The port selection is based on minigraph information; the minigraph contains only ports that are up.
/azpw run
Is it possible to separate the "cover different port types" part into a standalone PR? That's a bug fix, not an enhancement.
@qiluo-msft This is not a bug fix. Based on the code, the test was not designed to "cover different port types". Unfortunately, there is no test plan document to prove it, so I treat it as an enhancement.
…r state in sad scenarios (sonic-net#3853)

### Description of PR
Summary: Enhance warm reboot sad path test cases.
Fixes sonic-net#3683

### Type of change
- [x] Test case(new/improvement)

### Approach
#### What is the motivation for this PR?
The motivation is to fill the test gap by bringing different ports admin down and operationally down before warm-reboot and checking their status after warm-reboot. Sanity will make sure the changed ports are back up.

#### How did you do it?
- For port down cases, implemented special port selection logic based on port physical properties.
- For port down cases, also implemented port operational state change.
- The second enhancement required some refactoring: a few sad operations were taken out of the ptf script and implemented in pytest. The old sad path wasn't changed, to preserve ansible compatibility. This is a partial refactoring, done only to enable the first two items in this list. Other sad operations need to be moved to pytest as well in the future.

#### How did you verify/test it?
Running warm_reboot_multi_sad.
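For readers following the thread, the idea of selecting ports "based on port physical properties" could be sketched roughly as below. This is not the code from the PR: the function name `select_ports_by_type`, the `per_type` parameter, and the `speed` and `lanes` property keys are all assumptions made for illustration.

```python
from collections import defaultdict


def select_ports_by_type(ports, per_type=1):
    """Pick up to `per_type` ports from each group sharing physical properties.

    `ports` maps a port name to a dict of properties. The "speed" and "lanes"
    keys are hypothetical stand-ins for whatever the real test inspects.
    """
    groups = defaultdict(list)
    for name in sorted(ports):
        props = ports[name]
        # Ports with the same speed and lane count are treated as one type.
        key = (props["speed"], len(props["lanes"]))
        groups[key].append(name)

    selected = []
    for key in sorted(groups):
        selected.extend(groups[key][:per_type])
    return selected


# Hypothetical port table, not taken from any real minigraph.
example_ports = {
    "Ethernet0": {"speed": 100000, "lanes": [0, 1, 2, 3]},
    "Ethernet4": {"speed": 100000, "lanes": [4, 5, 6, 7]},
    "Ethernet8": {"speed": 50000, "lanes": [8, 9]},
    "Ethernet10": {"speed": 50000, "lanes": [10, 11]},
}
chosen = select_ports_by_type(example_ports)
```

With this table, `chosen` holds one port of each of the two types, so a port-down case would exercise both kinds of port rather than only the first one listed.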
This commit has caused several LAG flaps in the warm reboot sad cases across different platforms. Also, it looks like the revert function is now called twice (once from the ptf script, and once from pytest in the cases that were moved to pytest).
@vaibhavhd Hi, could you please share the command line to reproduce? I wonder whether you have experienced a bug in SONiC teamd warm-reboot that caused exactly this issue - sonic-net/sonic-buildimage#8227. In that case, could you please share the dump taken after the failing test case so we can verify this? Cases that are moved to pytest replace the corresponding cases in the ptf script. In the code you mentioned, self.sad_revert() is called only when self.sad_oper and self.sad_handle are set, which is not the case for the pytest cases. Please check this part of the code - https://github.com/Azure/sonic-mgmt/pull/3853/files#diff-bfbd05e640c01905f096016ec8fc695dba2364fc44f5e600fd087841d173cbaeR527
I am trying to evaluate whether it is a SONiC image issue. I just run Pytest
@stepanblyschak, the failures that I was seeing earlier were in fact due to issue 8227, and were fixed by your PR. With the fix included in the 202012 image, I no longer hit the same issues. Thank you for finding and fixing this.
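The guard described in the comment above can be illustrated with a minimal sketch. Only `sad_oper`, `sad_handle`, and `sad_revert` are names from the discussion; the class `RebootTestSketch`, the `post_reboot` method, and the `revert_calls` counter are hypothetical and exist only to make the behaviour observable.

```python
class RebootTestSketch:
    """Toy model of the ptf-side guard; not the real ptf test class."""

    def __init__(self, sad_oper=None, sad_handle=None):
        self.sad_oper = sad_oper
        self.sad_handle = sad_handle
        self.revert_calls = 0  # hypothetical counter, for illustration only

    def sad_revert(self):
        self.revert_calls += 1

    def post_reboot(self):
        # The ptf script reverts only when it owns the sad operation.
        if self.sad_oper and self.sad_handle:
            self.sad_revert()


# Case handled in pytest: the ptf script is started without a sad operation.
pytest_driven = RebootTestSketch()
pytest_driven.post_reboot()

# Case still handled in the ptf script: both attributes are set.
ptf_driven = RebootTestSketch(sad_oper="example_oper", sad_handle=object())
ptf_driven.post_reboot()
```

Under these assumptions the ptf-side revert runs once for the ptf-driven case and not at all for the pytest-driven one, which is the argument made in the comment.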
…terfaces (#4758) Fix advance-reboot SAD cases in testcase test_warm_reboot_multi_sad. The failure occurred when incorrect participating member interfaces in a LAG were selected. This issue supposedly started after #3853. As part of PR 3853, some of the SAD case handling was moved out of ptf-tests. Because of this, port selection for some cases is no longer done in the ptf scripts. Specifically, traffic is currently sent even through the LAGs that are brought down. This causes part of the downstream traffic to always be dropped, ultimately leading to warm-up failure.
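The failure mode described here, sending traffic through LAGs that were brought down, suggests a filter along the following lines. This is only a sketch of the idea and not the change made in #4758; the function name `traffic_ports` and both of its parameters are assumptions.

```python
def traffic_ports(lag_members, downed_lags):
    """Return member ports of LAGs that are still up, in a stable order.

    `lag_members` maps a LAG name to its member ports; `downed_lags` is the
    set of LAG names the sad case brought down. Both are hypothetical inputs.
    """
    return [
        port
        for lag in sorted(lag_members)
        if lag not in downed_lags
        for port in lag_members[lag]
    ]


# Hypothetical topology for illustration.
example_lags = {
    "PortChannel0001": ["Ethernet0", "Ethernet4"],
    "PortChannel0002": ["Ethernet8"],
}
usable = traffic_ports(example_lags, downed_lags={"PortChannel0001"})
```

Here `usable` contains only the members of `PortChannel0002`, so no test traffic would be directed at the LAG that was deliberately taken down.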
…r state in sad scenarios
Signed-off-by: Stepan Blyschak [email protected]
Description of PR
Summary: Enhance warm reboot sad path test cases.
Fixes #3683
Type of change
Back port request
Approach
What is the motivation for this PR?
The motivation is to fill the test gap by bringing different ports admin down and operationally down before warm-reboot and checking their status after warm-reboot. Sanity will make sure the changed ports are back up.
How did you do it?
How did you verify/test it?
Running warm_reboot_multi_sad.
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation